A faster parsing algorithm for Lexicalized Tree-Adjoining Grammars
Abstract
This paper points out some computational inefficiencies of standard TAG parsing algorithms when applied to LTAGs. We propose a novel algorithm with an asymptotic improvement, from […] to […], where […] is the input length and […] are grammar constants that are independent of vocabulary size.

Introduction

Lexicalized Tree-Adjoining Grammars (LTAGs) were first introduced in (Schabes et al., 1988) as a variant of Tree-Adjoining Grammars (TAGs) (Joshi, 1987). In LTAGs each elementary tree is specialized for some individual lexical item. Following the original proposal, LTAGs have been used in several state-of-the-art, real-world parsers; see for instance (Abeillé & Candito, 2000) and (Doran et al., 2000). Like link grammar (Sleator & Temperley, 1991) and lexicalized formalisms from the statistical parsing literature (Collins, 1997; Charniak, 1997; Alshawi, 1996; Eisner, 1996), LTAGs provide two main recognized advantages over more standard non-lexicalized formalisms:

- subcategorization can be specified separately for each word; and
- each word can restrict the anchors (head words) of its arguments and adjuncts, encoding lexical preferences as well as some effects of semantics and world knowledge.

To give a simple example, consider the verb walk, which is usually intransitive but can take an object in some restricted cases. An LTAG can easily specify the acceptability of the sentence "Mary walks the dog" by associating walk with a transitive elementary tree that selects for an indirect object tree anchored at the word dog (and some other words within a limited range).
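The walk example can be made concrete with a toy lexicon. The sketch below is only an illustration of per-word subcategorization and anchor restrictions, not the parsing algorithm proposed in the paper; the names `ElementaryTree`, `LEXICON`, `licensed_frames`, and the frame labels are hypothetical, and real elementary trees carry full tree structure rather than a frame label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementaryTree:
    """Hypothetical, heavily simplified stand-in for an LTAG elementary tree."""
    anchor: str                               # lexical item the tree is specialized for
    frame: str                                # assumed label for the subcategorization frame
    object_anchors: frozenset = frozenset()   # head words allowed in the object slot


# Toy lexicon (an assumption for illustration): "walks" has an intransitive
# tree and a transitive tree whose object must be anchored at "dog".
LEXICON = {
    "walks": [
        ElementaryTree("walks", "intransitive"),
        ElementaryTree("walks", "transitive", frozenset({"dog"})),
    ],
}


def licensed_frames(verb, object_head=None):
    """Return the frames of `verb` compatible with the given object head word."""
    frames = []
    for tree in LEXICON.get(verb, []):
        if tree.frame == "intransitive" and object_head is None:
            frames.append(tree.frame)
        elif tree.frame == "transitive" and object_head in tree.object_anchors:
            frames.append(tree.frame)
    return frames


print(licensed_frames("walks"))            # no object: intransitive tree applies
print(licensed_frames("walks", "dog"))     # object anchored at "dog": transitive tree applies
print(licensed_frames("walks", "piano"))   # object outside the allowed range: nothing applies
```

Because the restriction is stored on the tree anchored at the verb, adding another acceptable object only requires extending that word's entry, which is the kind of per-word specification the two advantages above describe.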
Similar resources
Practical experiments in parsing using Tree Adjoining Grammars
We present an implementation of a chart-based head-corner parsing algorithm for lexicalized Tree Adjoining Grammars. We report on some practical experiments where we parse 2250 sentences from the Wall Street Journal using this parser. In these experiments the parser is run without any statistical pruning; it produces all valid parses for each sentence in the form of a shared derivation forest. ...
Some Experiments on Indicators of Parsing Complexity for Lexicalized Grammars
In this paper, we identify syntactic lexical ambiguity and sentence complexity as factors that contribute to parsing complexity in fully lexicalized grammar formalisms such as Lexicalized Tree Adjoining Grammars. We also report on experiments that explore the effects of these factors on parsing complexity. We discuss how these constraints can be exploited in improving efficiency of parsers for ...
Lexicalization and Grammar Development
In this paper we present a fully lexicalized grammar formalism as a particularly attractive framework for the specification of natural language grammars. We discuss in detail Feature-based, Lexicalized Tree Adjoining Grammars (FB-LTAGs), a representative of the class of lexicalized grammars. We illustrate the advantages of lexicalized grammars in various contexts of natural language processing,...
An improved Earley parser with LTAG
This paper presents an adaptation of the Earley algorithm (Earley, 1968) for parsing with lexicalized tree-adjoining grammars (LTAGs). This algorithm constructs the derivation tree following a top-down strategy and verifies the valid prefix property. Many earlier algorithms do not have both of these properties (Schabes, 1994). The Earley-like algorithm described in (Schabes and Joshi, 1988) verifi...